Identification of American English
Initial /l/ and /r/

By Native Speakers of Japanese 

Abstract

616 CVC stimulus syllables beginning with /l/ and /r/
were presented by tape recording to six Japanese foreign
students at Indiana University. Analysis of variance showed
the effect of the final C variable on the identification of
initial /l/ and /r/ was significant at the .05 level; the
effect of the CV interaction was significant at the .01
level. Response scores for the CV interaction pairs
correspond closely to the magnitude of the third formant
shift from /l/ and /r/ to the various syllable nuclei
determined by Lehiste. Specific results, correlations
with frequency of occurrence and acoustic data, and
implications for improving /l/ and /r/ identification by
native speakers of Japanese are discussed.



Introduction 



Extensive knowledge of the phonology, syntax, culture, 
and values of the society makes much of the speech infor- 
mation transmitted between two native speakers of a 
particular language redundant. Because of redundancy, 
total comprehension can take place when the surface 
constructions are ambiguous and acoustic interference or 
attenuation seriously distorts much of the information 
originally transmitted by the speaker. The loss of a single 
phonemic or syntactic distinction would rarely have any 
effect on the comprehension of a native speaker. 

The non-native listener does not have the same mastery 
of phonology or syntax, nor the knowledge of the culture 
and the values of society. Thus, much of the speech 
information that would have been redundant for a native 
listener would not be so for a non-native listener. The 
non-native listener is either unable to decode, or decodes
incorrectly, much of the phonological and syntactic
information. In addition, the non-native speaker's lack of
knowledge of the culture and values of the society makes
more information necessary to assure total comprehension
of the message. As a result, surface ambiguities (ambiguous
surface constructions) and loss of information through
acoustic interference and attenuation cause even more
difficulty in comprehension for the non-native listener
than for the native listener. The loss of even a single
phonemic or syntactic distinction could readily lead to a
comprehension breakdown.

The ultimate goal of the second-language learner
desiring mastery would be the perception and interpretation
of the same number of cues in the same way as the native
speaker. Since it is a very long-term project to provide
the non-native speaker with extensive experience and
knowledge of the culture and values of the society, initial
efforts to improve comprehension mastery are better
concentrated on the perception (decoding) of phonological
and syntactic cues. Basic to the decoding of syntactic
cues is the decoding of phonological cues. Even though
syntactic cues may be used to reconstruct phonological units
as well as vice versa, there must first be some phonological
cues in order to detect any syntactic information.

The approach to designing a second-language instructional
program is often limited to:

1) The preparation of a comparative analysis of the
two languages in search of areas of possible interference.
Formerly, this was done more often by comparison of phonemic
analyses and syntactic taxonomies of the two languages.
More recently, investigators have taken to comparing the
various phonological and syntactic transformational grammars
of each particular language.

2) Alternatively, "skeptics" of comparative analysis,
not recognizing a great contribution to instruction through
such an approach, have often limited themselves to
uncontrolled superficial observations of certain
second-language learning problems in the classroom. Such casual
observations as, "The Japanese cannot pronounce /l/," might
be expected.

Although it is understandable that "skeptics" find
little practical use for comparative analysis in devising
and designing a meaningful instructional program for these
problems, at least as comparative analysis is commonly
applied, it is frankly difficult to see how casual
recollections of classroom experience could be of any greater
value.

While comparative analysis arrives at a prediction of
problems and problem areas in second-language learning,
these predictions are obviously worthless if they do not
correspond to the problems the students actually have. The
prediction from a comparative analysis that speakers of
German would experience confusion between American English
(AE) /t/ and /θ/, when in fact a few minutes' observation of
a learner would indicate a confusion between AE /s/ and
/θ/, is plainly a waste of time.

The implications of various parts of a confirmed
prediction are also unclear. The comparison of two analyses
carried out according to certain theory-specific adequacy
and economy criteria, deleting certain redundant information,
cannot be expected to arrive at conclusions which are the
result of the interaction of all of the learner and language
facts.

On the other hand, casual observations of the "skeptics"
afford no indications of underlying factors of the learning
problems. They seem only to provide fertile seed beds for
the growth and spread of personal peeves and irresponsible
speculation.

These remarks are not to be taken by any means as a
total condemnation of the proponents of comparative analysis
or the "skeptics." Proponents of comparative analysis can
provide the linguistic facts and the system for utilizing
these facts; the "skeptics" can provide the observations
which can validate or reject basic premises and thus
activate the system for utilizing these facts. As pointed
out by Anderson (1964), investigation which considers the
basic linguistic facts and experimentally takes the learner
into account each step of the way will avoid the pitfalls
of counterfactive projection and overgeneralization from
superficial observation, and thus can contribute valuable
information for use by instructional program designers.

Those coming in contact with Japanese who speak English
as a second language readily notice that the Japanese often
pronounce the AE /l/ and /r/ the same to AE ears. Less
obvious to non-Japanese is the difficulty the speakers of
Japanese experience in perceiving AE /l/ and /r/. Initial
pilot tests indicated that the Japanese had considerably
less success in identifying /l/ and /r/ produced by a
speaker of AE than the speakers of AE had in identifying
/l/ and /r/ produced by the speakers of Japanese. Nakajima
(1957) explicitly mentions that it is "next to impossible
for Japanese speakers of English to discriminate them in
hearing."

Because of the high degree of confusability of AE /l/
and /r/ for Japanese exchange students, the relatively high
frequency of occurrence and information load, and the fact
that most of the Japanese exchange students can scarcely
tolerate such loss of information due to problems in syntax
and lack of experience in the culture, extensive testing of
the identification of AE /l/ and /r/ by Japanese was chosen
first for investigation.

The use of "allophone" in this paper has been limited
to those phoneme variants, distributionally described, that
are inherently different from one another in terms of a
feature. Thus, the final [l] and initial [l] are two
allophones of phoneme /l/, distributionally predicted, that
differ with respect to certain, let us say, articulatory
features. Yet, they are certainly the same in many others.

In addition to "distributional" allophones there are
"contextual" variants. As pictured here, a "contextual"
variant would be the manifestation of a particular
"distributional" allophone in contiguity with other phones.
The (target) articulatory features at the point of maximum
closure would be the same for two "contextual" variants.
As the context for two "contextual" variants differs by
definition, the transition either from the preceding phone
or to the following phone would necessarily be different.
For example, the transitions from [l] to [o] are quite
different from those to [e] (cf., Figure 13). Yet, the
starting Fs for the two are very nearly the same (cf.,
Figures 1 and 12).

Since the transitions to and from a particular
consonant phoneme, such as /l/, are considered part of that
phoneme, and play a valuable role in the detection and
perception of the consonant phoneme, it follows that there
are variants from context to context, independent of the
articulatory position of maximum closure. (In the case of
stops such as [t], there is at most only the sound of air
turbulence; nothing at all if the [t] is unaspirated. The
only perceivable portion in that case is the transition
(Liberman, Delattre, Cooper, & Gerstman, 1954).)

Initial pilot checks indicated that Japanese students
seemed to have less trouble in identifying final /l/ and /r/.
/l/ and /r/ in cluster, including the most extreme
allophonic variation, seemed to cause all the students so much
trouble that there would probably be little variation of
identification scores from context to context. In order to
maintain the complexity of the experimental design within
reasonable limits, the investigation was limited to initial
/l/ and /r/ until the effects of context had been determined.

Such investigation would entail controlled
experimentation whereby a number of Subjects (Ss) would be chosen
at random from a target population of Japanese exchange
students. AE initial /l/ and /r/ would be presented in
various contexts to the Japanese Ss for phonemic
identification. The across-S differences in identifiability for
the various contexts would be compared with acoustic
characteristics and frequency of occurrence of the stimuli.
Such comparison of across-S differences with linguistic
facts of the stimulus leads to a more basic understanding
of the events surrounding the /l/ and /r/ identification
problem, and can therefore ultimately lead to more
successful instructional designs.

The next three sections contain a short articulatory
description of AE initial /l/ and /r/ and Japanese /r/ and
a somewhat more extensive acoustic description of the
variants of AE initial /l/ and /r/ in various contexts
according to the Lehiste (1964) data. These are followed
by a short section summarizing the implications of the
acoustic and articulatory evidence for the identification
of AE initial /l/ and /r/ by speakers of Japanese.



Articulatory Description of AE /l/ and /r/

AE initial allophone of /l/ is a voiced apico-alveolar
lateral vocoid. The initial variety begins with the tongue
position approximately that for AE [t], [d], [n], but with
the sides of the tongue lowered so that air passes over and
out the sides without friction. The end of the initial /l/
is characterized by opening of a center channel for air
passage and transition to the position of the following
vowel.

AE initial allophone of /r/ is a voiced apico-alveolar
retroflex semivowel. According to Francis (1958), the sides
of the tongue are against the back teeth in contrast with
the initial /l/. "The blade and tip are turned upward and
withdrawn a bit toward the back of the mouth, the tip points
to the extreme back of the alveolar ridge where it joins the
palate, considerably back of the position of contact for the
alveolar consonants [t], [d], and [n]. From this position
the apex flicks rapidly forward and down into the position
for the following vowel."

Prator (1951) notes as well that the lips are open and
comments more fully on the manner of articulation. "In
whatever direction the movement may end, it always begins
by a motion toward the back of the mouth. More than any
other factor, it is this retroflex (toward the back) motion
that gives the English [r] its typical sound. The tongue
tip rises a little and is curved backward, while the sides
of the tongue slide along the back part of the tooth ridge
as along two rails." Similarly Jones, quoted in Toyoda
(1957), describes the AE /r/ articulation as "a general
retraction of the whole body of the tongue with simultaneous
lateral contraction."

Articulatory Description of Japanese /r/

Roughly corresponding to AE /l/ and /r/ is the Japanese
/r/ described by Bloch (1950) as a short alveolar flap.
Bloch reported the Japanese /r/ to consist of two allophones
in most speakers, three in some. The [r], a short voiced
alveolar flap, occurs before [e, a, o, u]. The [r], a short
alveolar palatalized flap, occurs before [i, y]. For some
speakers [l] stands in free variation with [r] before [e, o].

The allophones of both AE /l/ and /r/, while similar
with respect to the point of articulation to the allophones
of Japanese /r/, are quite different in manner of
articulation. The initial AE and Japanese [l] are both alveolar,
lateral, and voiced. The AE [l] is a vocoid, the Japanese
[l] a flap. The initial AE and Japanese [r] are both
alveolar and voiced, but the AE [r] is also retroflexed
and a semivowel, the Japanese [r] a flap. (Kimizuka (1962)
considers the point of articulation of the Japanese /r/ and
AE /r/ to be different. Kimizuka contends the Japanese
/r/ "is produced by the movement of the tip of the tongue
touching the palate and releasing it." Elsewhere he makes
the general statement that the sound of Japanese /r/ is
"somewhere between the American /r/, /l/ and /d/.")



Acoustic Description of AE Initial /l/ and /r/
According to the Lehiste (1964) Data

Figures 1, 2, and 3 contain the mean formant
frequencies of AE initial /l/ and /r/ when followed by 11
different vowel nuclei, as well as the steady state frequencies
of the vowel nuclei themselves. These various mean formant
frequencies were computed from the spectrograms of the
pronunciation of five native speakers of AE by Lehiste.

Figures 1 and 2 contain the mean frequencies of F1 and
F2. Figure 1 contains /l/ in the various contexts; Figure
2 contains /r/. The direction of increasing frequency has
been adjusted so that the vowels assume positions
approximately those in an articulatory or perceptual triangle
chart (Hanson, 1960). Thus, fronting is toward the left,
backing toward the right. High is toward the top of the
sheet, low toward the bottom.

The straight line drawn from the point of F1 and F2
for the initial consonant, C(F1,F2), to the point of F1
and F2 for the vowel nucleus, V(F1,F2), does not mean the
transition actually followed the shortest path between
those two points. The straight line serves to visually
locate the two points of a phonemic sequence on the chart.

The mean of all the F1 and F2 for /l/, /l/(F1,F2), is
not very different from /r/(F1,F2). But /l/(F1,F2) varies
much more with respect to the following vowel context than
does /r/(F1,F2). The values for /l/(F1,F2) are contained
within an area bounded by 250<F1<220 cps and 850<F2<1200 cps,
while the values for /r/(F1,F2) are contained in an area
bounded by 255<F1<305 cps and 870<F2<990 cps. There are
some differences in the V(F1,F2) preceded by /l/ from those
preceded by /r/, but no general trend could be easily
defined.

Figure 3 contains the mean frequencies for F2 and F3
for both AE /l/ and /r/ preceding 11 vowel nuclei. As an
aid to reading the graph, F2 was plotted along the
horizontal axis increasing to the left as it was for Figures 1 and
2. F3 is plotted along the vertical axis increasing towards
the top of the graph. The straight lines drawn from
C(F2,F3) to V(F2,F3) serve to visually locate the two points
of a phonemic sequence on the graph.

There was little variation for /l/(F3) and /r/(F3) with
respect to the following vowel context. In most instances
the V(F3) preceded by /l/ is slightly lower than V(F3)
preceded by /r/. The effect is minimal, however. The
variation of /l/(F2,F3) seen in Figure 3 is due to the
variation of /l/(F2) pointed out in Figure 1.

The only striking difference between the formant
frequencies of /l/ and /r/ occurs in the third formant, as
has been indicated by Joos (1948); O'Connor, Gerstman,
Liberman, Delattre, & Cooper (1957); Lisker (1957), and
others as well as Lehiste herself.

Summary of Acoustic and Articulatory
Evidence on Identification

Native speakers of Japanese would not be expected to
phonemically distinguish between lateral and non-lateral
sounds since according to Bloch (1950),

1) the lateral and non-lateral alveolar flaps are in
free variation for some speakers; and

2) the lateral flap does not even exist for most
speakers of Japanese.

Since all the allophones of Japanese
/r/ are flaps, the Japanese would not be expected to be
sensitive to duration and abruptness cues in connection
with AE /l/ and /r/. Of course, the Japanese Ss already
speak English and would therefore be expected to identify
AE /l/ and /r/ correctly to some extent.

The identification of AE /l/ and /r/ was expected to
be affected by allophonic variation. Each major allophonic
variation constitutes a different stimulus which must be
perceived and relegated to the appropriate phoneme. This
was borne out in pilot checks where identification scores
for AE /l/ and /r/ in cluster were considerably lower than
the scores for the initial allophones. AE /l/ and /r/ in
cluster were not included in this study in order to
maintain the complexity of the experimental design within
reasonable limits until the effects of the non-allophonic
varying context had been examined.

Phonological context would be expected to affect
identification through other than strict allophonic
variation. Results in French and Steinberg (1947), Luescher
and Zwislocki (1949), Moser and Dreher (1955), Hardy (1956),
and Calearo and Lazzaroni (1957), along with others, would
tend to indicate that initial /l/ and /r/ might not be
perceived only in terms of inherent (self-contained)
characteristics, but also in comparison with (in contrast
with) contiguously occurring phonemes. Results of this
study would therefore not only point out troublesome
phonemic sequences, but would have bearing on which context
dimensions had an effect on identification and which
features of the stimuli the Ss seemed to be perceiving.



Experimental Design and Equipment

There were three stimulus variables, the first of
which was I, the syllable initial consonant variable with
two levels: /l, r/. The second, V, was the vowel nucleus
with 14 levels: /i, ɪ, e, ɛ, æ, a, ɔ, o, ʊ, u, ə, aɪ, aʊ,
ɔɪ/. The third, C, was the final consonant variable with
22 levels: /p, b, t, d, k, g, f, v, θ, ð, s, z, š, ž, č, ǰ,
m, n, ŋ, l, r, Ø/. (Ø represents "without final consonant.")

The 616 stimulus syllables consisted of all possible
combinations of levels for the three variables (2 x 14 x 22).
In addition to the original list of stimulus syllables in
phonemic transcription, a second list of the syllables in
conventional English orthography was prepared for use on the
response sheet.
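The construction of the stimulus set can be sketched as a full crossing of the three variables. The sketch below is only an illustration: the ASCII labels are placeholders assumed for this example (they are not the phonemic symbols of the original list), and the empty string stands for "without final consonant."

```python
from itertools import product

# Placeholder ASCII labels assumed for this sketch; only the number of
# levels per variable (2, 14, 22) is taken from the text.
INITIALS = ["l", "r"]
VOWELS = ["i", "I", "e", "E", "ae", "a", "O", "o", "U", "u", "@",
          "aI", "aU", "OI"]
FINALS = ["p", "b", "t", "d", "k", "g", "f", "v", "th", "dh", "s", "z",
          "sh", "zh", "ch", "j", "m", "n", "ng", "l", "r", ""]


def build_stimuli():
    """Return every (initial, vowel, final) combination of the three variables."""
    return list(product(INITIALS, VOWELS, FINALS))


stimuli = build_stimuli()
print(len(stimuli))  # number of stimulus syllables in the full crossing
```

Because every level of each variable is crossed with every level of the others, the list contains both "real" and "non-real" syllables, which is what makes the later comparison between the two possible.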

A randomized list of the 616 stimulus syllables was 
prepared and then read onto tape by the experimenter. The 
syllables were pronounced at 3-second intervals. After 
the pronunciation of every fifth syllable on the list, the 
number of the next stimulus syllable was given. 

For each oral stimulus recorded on tape, two syllables
appeared on the response sheet: the conventional English
spelling of the stimulus syllable and its /l/ or /r/
counterpart. The left-right position of the /l/ syllable was
assigned randomly, the /r/ syllable taking the opposite
position.

For example: 1. lace race

2. rid lid etc.

The six Japanese Ss, who were students at Indiana
University, were tested individually. Each S was placed
at a table equidistant from the two high quality
loudspeakers. The distance from the speakers and the sound
level were the same for all six sessions.

The instructions were presented to the S in both
written (cf., Appendix A) and oral form. The instruction
sheet was separate from the response sheets and contained
six examples. The oral form was played back from tape
through the same system and under the same conditions as for
the presentation of the stimulus syllables. None of the six
Ss requested further explanation after presentation of the
instructions.

All recordings were made at the language laboratory 
recording studios in Ballantine Hall, Indiana University 
on Ampex recorders. The tapes were recorded at 7.5 inches/
second (19 cm/sec) half-track. Playback for testing was
accomplished on a Telefunken 96k tape recorder connected 
to a Fisher Model 800 radio-amplifier system. The 
transmission was judged clear after testing two native 
speakers of English. Out of 616 responses, there were four 
errors for one S, and six for the other. The two native 
Ss had no error in common, and therefore the loss in the 
system was considered to be negligible. 

Experimental Results 

The summary of the analysis of variance can be found 
in Table 1. The means for the various levels of each of 
the four variables can be found in Table 2. 

This experiment was considered to be an expansion of
the A x B x S design in Lindquist (1953) to an A x B x C x S
design. As in Lindquist, the error term used for each F
test was the interaction of the variable (or variables) to
be tested with the Ss.
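The choice of error term can be checked against the mean squares in Table 1. The following is a minimal sketch, not the original computation: it assumes that the line labeled "Residual" is the I x V x C x S interaction (its 1365 degrees of freedom equal 1 x 13 x 21 x 5), and the function names are hypothetical.

```python
# Mean squares as listed in Table 1.
MEAN_SQUARES = {
    "I": 6.84010, "V": 0.15337, "C": 0.21904,
    "IV": 0.61457, "IC": 0.28738, "VC": 0.17356, "IVC": 0.17174,
    "IS": 4.49529, "VS": 0.17510, "CS": 0.12451,
    "IVS": 0.23900, "ICS": 0.21876, "VCS": 0.15489,
    "Residual": 0.16760,
}


def error_term(effect):
    """Name of the effect-by-Ss interaction used as the error term."""
    # Assumption: the residual line serves as the IVC x S interaction.
    return "Residual" if effect == "IVC" else effect + "S"


def f_ratio(effect):
    """F ratio of an effect tested against its interaction with the Ss."""
    return MEAN_SQUARES[effect] / MEAN_SQUARES[error_term(effect)]


for effect in ["I", "V", "C", "IV", "IC", "VC", "IVC"]:
    print(effect, error_term(effect), round(f_ratio(effect), 5))
```

Each ratio is referred to the F distribution with the degrees of freedom of the effect and of its error term, for example 21 and 105 for C.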

The analysis of variance showed the C, the ending of 
TABLE 1

Summary of Analysis of Variance
Grand Mean = 0.73404

Source of     Degrees of    Mean
Variation     Freedom       Squares       F          P

I                  1         6.84010      1.52161    N.S.
V                 13         0.15337       .87590    N.S.
C                 21         0.21904      1.75922    <.05
S                  5        14.84118
IV                13         0.61457      2.57142    <.01
IC                21         0.28738      1.31368    N.S.
IS                 5         4.49529
VC               273         0.17356      1.12054    N.S.
VS                65         0.17510
CS               105         0.12451
IVC              273         0.17174      1.02470    N.S.
IVS               65         0.23900
ICS              105         0.21876
VCS             1365         0.15489
Residual        1365         0.16760
Total           3695

I = Syllable Initial Consonant
V = Vowel Nucleus
C = Final Consonant
S = Subject
TABLE 2

Marginal Means

Variables  Levels  Means        Variables  Levels  Means

I          l-      0.69102                 -f      0.72619
           r-      0.77706                 -v      0.75595
V          -i-     0.67803                 -θ      0.66071
           -ɪ-     0.72727                 -ð      0.66071
           -e-     0.73485                 -s      0.75595
           -ɛ-     0.76515                 -z      0.74405
           -æ-     0.72727                 -š      0.66667
           -a-     0.75000                 -ž      0.70238
           -ɔ-     0.74621                 -č      0.76190
           -o-     0.70455                 -ǰ      0.74405
           -ʊ-     0.73485                 -m      0.73810
           -u-     0.74242                 -n      0.76190
           -ə-     0.76515                 -ŋ      0.73810
           -aɪ-    0.75758                 -l      0.78571
           -aʊ-    0.73106                 -r      0.75000
           -ɔɪ-    0.71212                 Ø       0.77976
C          -p      0.71429      S          1       0.80357
           -b      0.70238                 2       0.62987
           -t      0.72619                 3       0.91558
           -d      0.75000                 4       0.54058
           -k      0.76190                 5       0.62825
           -g      0.76190                 6       0.88636

I = Syllable Initial Consonant
V = Vowel Nucleus
C = Final Consonant
S = Subject
the stimulus syllable, to have a statistically significant
main effect on the perception of the syllable-initial /l/
and /r/ (F = 1.76; 21, 105 d.f.; P<.05).

Figure 4 shows the means for the 22 levels of variable
C in the order of decreasing proportion correct for the
criterion task.

The Duncan procedure, as described in Winer (1962), was
applied to determine which of the means of the final
consonants were (statistically) significantly different
from the others. Table 3 contains a summary of analysis
and the group score difference matrix used to determine
whether the scores for one group are significantly different
from the others. The results are shown in Table 4.

Schematically the results in Table 4 might be
summarized as follows:

Group Number   1   2   3   4   5   6   7   8   9   10  11  12

               θ   š   b   p   t   m   z   d   s   g   Ø   l
               ð       ž       f   ŋ   ǰ   r   v   n
                                                   č
                                                   k
               ______________________________________
                       ______________________________________

Those levels included in groups underlined by a common line
do not differ (significantly); levels included in groups
not underlined by a common line do differ. Thus, levels
for /l/ and Ø differ from levels for /θ, ð, š/ but not
/b, p, t, f, m, ŋ, z, ǰ, d, s, r, v, g, n, č, k/, and
similarly /θ, ð, š/ differ from /l/ and Ø but not from
/b, p, t, f, m, ŋ, z, ǰ, d, s, r, v, g, n, č, k/. It
is predicted that the probability of such differences
occurring as a result of random error would be less than 1%.
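The twelve score groups in the schematic can be recovered by ordering the final-consonant means of Table 2 and collecting levels with identical scores. The sketch below does only that ordering step; it does not carry out the Duncan procedure itself. The ASCII labels are placeholders assumed for this example ("0" stands for the null consonant).

```python
from itertools import groupby

# Marginal means for the 22 levels of C, read from Table 2.
# Labels are ASCII stand-ins assumed for this sketch.
FINAL_MEANS = {
    "p": 0.71429, "b": 0.70238, "t": 0.72619, "d": 0.75000,
    "k": 0.76190, "g": 0.76190, "f": 0.72619, "v": 0.75595,
    "th": 0.66071, "dh": 0.66071, "s": 0.75595, "z": 0.74405,
    "sh": 0.66667, "zh": 0.70238, "ch": 0.76190, "j": 0.74405,
    "m": 0.73810, "n": 0.76190, "ng": 0.73810, "l": 0.78571,
    "r": 0.75000, "0": 0.77976,
}


def score_groups(means):
    """Group levels that share the same mean, in ascending order of score."""
    ordered = sorted(means.items(), key=lambda item: item[1])
    return [(score, [label for label, _ in members])
            for score, members in groupby(ordered, key=lambda item: item[1])]


for number, (score, labels) in enumerate(score_groups(FINAL_MEANS), start=1):
    print(number, score, labels)
```

Grouping on exact equality is adequate here only because tied levels carry the same printed value; with unrounded data a tolerance would be needed.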

The null consonant, Ø, indicating the absence of final
consonant, /C/, corresponded to relatively high perception 
scores as predicted. Although not necessarily greater in 
duration, the stimulus syllables of three phonemes are 1/2 
longer in terms of phoneme length than those of two phonemes. 
Additional stimulus length is associated with increased 
response difficulty unless the additional length provides 
information needed for solution to criterion. 

No helpful information would be provided by the
addition of a /-C/ in most instances. In fact the addition
of a /-C/ may distract and interfere with the necessary
information in the first two phonemes. A final consonant
probably has a slight effect on the preceding vowel, which
in turn may have a slight effect on the transitions from
the points of major constriction for /l/ and /r/ to the
vowel nucleus. As the transitions are of great importance
for the perception of /l/ and /r/, any additional change
in them would introduce an additional source of variability
with the effect of static or interference in the system.

As the Ss could regularly discriminate between
syllable final /l/ and /r/ in pilot tests and were given
the final consonant context on the score sheet, it is
conceivable that Ss could benefit from the presence of a
final /l/ or /r/ by means of comparison with the initial
/l/ or /r/. The stimulus syllable would be viewed as a
set of three stimuli and the task would be to establish
whether the initial stimulus was the same as or different
from the final one. Apparently the Ss could identify the
final [l] as the phoneme /l/ better than [r] as /r/, since
there were higher identification scores for syllables
ending in /l/.

The Japanese Ss nearly always indicated a final [r]
in pilot pronunciation tests of the stimulus syllables by
[a]. (This observation is made in Kimizuka (1962) as well.)
We might presume the pronunciation of /-or/ as [-oa], for
example, to be indicative of perception difficulties. If
the Ss thought they had heard Ø instead of final /-r/, they
would employ the same hypotheses for perception of /IV/ +
/r/ as for /IV/ + Ø, resulting in a decrease in performance
scores. The mean proportion correct for stimulus syllables
/IV/ + /-r/ was .762, somewhat below that of .786 for /IV/
+ Ø and .792 for /IV/ + /-l/. None of these differences
were statistically significant, however.

The low scores for /θ, ð, š/ could not be explained
on the basis of occurrence in Japanese or phonological
characteristics. The /θ/ and /ð/, corresponding to a score
of .661, do not occur in Japanese, but neither does /v/,
which corresponds to a score of .756. On the other hand,
/š/ is a moderately frequent consonant phoneme in Japanese
and corresponds to a score of only .667. No simple grouping
according to articulatory position or manner as well as
distinctive features seemed to work well, because a group
including /θ, ð, š/ would also include /s, z/, corresponding
to scores of .756 and .744 in the mid-high portion of the
scale. (/θ, ð, š/ differ from /s, r, v, k, n, g, Ø, l/
at the .05 level according to the Duncan procedure.)

In the absence of a phonological grouping that would
correspond to the grouping of response scores, it was felt
the perception of "real" as opposed to "non-real" stimulus
would have to be examined. A "real" stimulus syllable would
presumably be one that was a real word in one English
dialect or another, or at least occurred in a real word.
The point of comparing responses to "real" with "non-real"
is derived from the assumption that the Ss will have been
exposed more often to the "real" stimulus syllables. Some
kind of estimate of the frequency of occurrence in English
would provide further information as to exposure.

Therefore, a list of word examples was compiled
meeting the following criteria (cf., Appendix B). A
standard pronunciation of the word

a) is a stimulus syllable,

b) is a stimulus syllable plus another consonant.
(Clustering of the measure variable in initial position was
not included; /l/ and /r/ in cluster have rather different
phonemic shapes which give rise to substantially lower
response scores in pilot tests.)

c) is a stressed syllable in a polysyllabic word.

Proper names were excluded from this list. The corresponding
Rinsland (1945) total-count defined frequency of
occurrence was included as a rough estimate of the Ss'
exposure to those words meeting criterion (a). Due to
the large number of words that could possibly meet criteria
(b) and (c), it was not possible to estimate the frequency
of occurrence.

(The words in the Rinsland list were generated by
children in Grades I-VIII throughout the United
States primarily as written material. Only the
Grade I tabulation was supplemented with oral
material. The range of total running words was
from 350 thousand for Grade I to 1+ million for
Grade VIII, the range of total different words
from 5+ thousand for Grade I to about 18 thousand
for Grade VIII. The total-count defined frequency
represents the numerical total of the occurrence
for each grade even though the number of words
collected for each grade was not constant. This
weights the sum toward Grade VIII usage where
the total number of words was greatest. At best
the use of the total-count defined frequency is a
rough estimate of the exposure or possible exposure
to these English words.)

On the basis of the information in this list the 
following counts and comparisons were made: 



Group A:

The tabulation for Group A included the responses to
stimulus syllables that were the standard dialect
pronunciations of words meeting criterion (a) and having a
Rinsland frequency greater than 0. Thus, for Group A only
the responses to the /lak/ pronunciation of "lock" were
included in the tabulation, while for other counts the
responses to both /lak/ and /lok/ were included. The
response scores for the "real" stimulus syllables were
tabulated separately from the rest of the syllables, termed
"non-real," for each S and then compared using the dependent
measures t-test.

Table 5 contains the mean proportion correct, S by S, for the "real" versus the "non-real" stimulus syllables as defined above, together with the computed t-score and P. Even though the dependent measures t-test is a relatively non-conservative test, i.e., tends to yield a relatively small P, P is still greater than .10.
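
The dependent measures t-test described above can be sketched as follows. This is a minimal illustration, assuming the textbook paired-difference formula with n - 1 in the variance; the function name `paired_t` is hypothetical, and the input values are the proportions as they appear in Table 5 of this copy, so the statistic need not reproduce the reported t-score exactly.

```python
from math import sqrt


def paired_t(a, b):
    """Return (t, degrees of freedom) for the dependent measures t-test on b - a."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased variance of the paired differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / sqrt(var_d / n), n - 1


# Mean proportion correct per S, as printed in Table 5 of this copy.
real = [0.805, 0.590, 0.900, 0.545, 0.628, 0.870]
non_real = [0.803, 0.710, 0.949, 0.535, 0.644, 0.917]

t, df = paired_t(real, non_real)

# Tabled critical values of t for 5 d.f. at the .2 and .1 levels (Table 5).
T_CRIT_20, T_CRIT_10 = 1.476, 2.015
print(f"t = {t:.3f}, d.f. = {df}")
print("exceeds the .10 critical value:", abs(t) > T_CRIT_10)
```

With the proportions as printed here, the statistic falls between the tabled critical values 1.476 and 2.015, which agrees with the conclusion .1 < P < .2.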

Group B: 

The tabulations for Group B included the responses to stimulus syllables used by many native speakers of English at the University. Thus, both /lak/ and /lok/ were considered pronunciations of "lock." The mean proportion correct for each of the five groups of words was computed for each of the six Ss. The t-score was computed for every possible pair.

Stimulus syllables Group 1 consisted of pronunciations of all words in the list satisfying criterion (a) regardless of Rinsland frequency. These syllables might be regarded as "real" words.

Stimulus syllables Group 2 consisted of pronunciations of words satisfying criteria (b) or (c) but not (a). The syllables in this group occurred as sequences of phonemes in stressed position, but were not monosyllabic words in themselves.

Stimulus syllables Group 3 consisted of all those syllables for which there was no word in the list satisfying criterion (a), (b), or (c). These syllables were neither "real" words nor sequences.
TABLE 5

T-score for Non-Independent Measures Between "Real" and "Non-Real" Stimulus Syllables, Computation Group A

Mean Correct for     S1      S2      S3      S4      S5      S6

Real                .805    .590    .900    .545    .628    .870
Non-real            .803    .710    .949    .535    .644    .917

d.f. = 5

t for α =            .5      .4      .3      .2      .1      .05
t =                 .727    .920   1.156   1.476   2.015   2.571

t computed = 1.58

.1 < P < .2

Stimulus syllables Group 4 consisted of stimulus syllables Groups 2 and 3. None of these syllables occurred as English monosyllabic words themselves.

Stimulus syllables Group 5 consisted of stimulus 
syllables Groups 1 and 2. All these syllables occurred 
as phonemic sequences in the stressed position of 
English words regardless of the number of syllables 
in the word. 

A matrix containing the t-scores for all possible paired combinations of stimulus syllables Groups 1 to 5 can be found in Table 6. The largest t-score between independent groups with no stimulus syllables in common was 1.873 for stimulus syllables Groups 1 and 2, "real" words and occurring sequences. The smallest t-score was between stimulus syllables Groups 2 and 3, occurring and non-occurring sequences. For all comparisons P was greater than .10; therefore no significant differences were found in the responses to the various stimulus syllable groups.

The results for both tabulation Groups A and B are therefore for the most part inconclusive. It must be pointed out, however, that there is little to indicate that the occurrence of a stimulus syllable as a phoneme sequence in the stressed position in itself contributed to the perceptibility of the initial /l/ and /r/ if the stimulus syllable was not a "real" word in itself.

The various words in the list grouped by criterion were broken down further into groups on the basis of the final consonant. The numbers of words in the 22 final consonant groups were counted for each of the criterion groups of words, and then compared with the total number of correct responses and Rinsland frequency for monosyllabic words in each of the 22 final consonant groups.

TABLE 6

T-scores for Non-independent Measures between Pairs of Stimulus Syllable Groups for 6 Subjects, Computation Group B

Stimulus
Syllable
Groups        1        2         3         4              5

1             -      1.873     1.249     1.563          1.871
2                      -      -0.019    -0.019         -1.874
3                                -       [illegible]   -0.861
4                                          -           -1.155
5                                                         -

d.f. = 5

t for α = .2:  1.476
t for α = .1:  2.015

.1 < P for all pairs

TABLE 7

Pearson-R Correlations Matrix for Score Groups 1 through 6

Groups        1        2         3         4         5         6

1             -      0.262     0.513     0.546     0.123    -0.552
2                      -       0.647     0.699     0.007    -0.636
3                                -       0.906     0.189    -0.908
4                                          -      -0.039    -0.888
5                                                    -      -0.425

Table 7 contains the Pearson-R correlations for the following paired observations:

1. Total score correct out of 168 possible for all stimulus syllables.

2. Rinsland total-count defined frequency of words meeting criterion (a).

3. Number of monosyllabic words satisfying criterion (a) and having a Rinsland frequency greater than zero.

4. Number of stimulus syllables corresponding to words in the list that satisfy criterion (a) regardless of the frequency.

5. Number of stimulus syllables corresponding to words in the list satisfying criteria (b) or (c), but not (a).

6. Number of stimulus syllables which correspond to no words in the list, i.e., those not included in 4 or 5.



Although there is no significant difference in the mean 
proportion correct for various types of "real" versus 
"non-real" or "non-occurring" stimulus syllables, there is 
a positive correlation between the number of correct 
responses and the number of monosyllabic words satisfying 
criterion (a) . 

As mentioned above, the numbers of words represented by the stimulus syllables, the frequency of these in English, and the number of stimulus syllables that represent words are imperfect measures of what we might like to call the S's exposure to the stimulus syllables in his everyday experience with English. The number of stimulus syllables representing words, the numbers of words (which are highly related to the first), and the frequencies of the words are very low for stimulus syllables /IV/ + /θ, ð, ʒ/, for which the lowest response scores were observed.

Figures 5, 6, and 7 contain graphs of the relations between variables 1 and 2, 1 and 3, and 1 and 4.

The Pearson-R correlation is actually not appropriate here, in the sense that any deviation from a straight line reduces the size of R, as do high variability and small range. A threshold jump is a distinct possibility here. It is evident that the relation between 1 and 2 is low, especially because of the high error variance. The relation of 1 and 4 is approximately zero for the majority of the final consonant groups, while the few groups for which there are few stimulus syllables that are monosyllabic words have the lowest response scores.
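
The point about straight-line deviation can be illustrated with a small sketch. The data below are HYPOTHETICAL (an exposure index for 22 groups and two invented score patterns), not the study's counts; the function name `pearson_r` and the threshold value are assumptions made for the example. A noise-free threshold relation still yields an R well below 1, whereas a noise-free linear relation yields R = 1.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL data: 22 groups with an exposure index 0..21.
exposure = list(range(22))

# Pattern A: the score rises steadily with exposure.
linear_scores = [100 + 2 * x for x in exposure]

# Pattern B: the score is depressed only below a threshold, flat above it.
THRESHOLD = 3
threshold_scores = [100 if x < THRESHOLD else 140 for x in exposure]

r_linear = pearson_r(exposure, linear_scores)
r_threshold = pearson_r(exposure, threshold_scores)
print(f"linear: r = {r_linear:.3f}; threshold: r = {r_threshold:.3f}")
```

Under these assumptions the threshold pattern gives an R of roughly .6 even though the scores are fully determined by exposure, which is the sense in which a low R does not rule out a threshold effect.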

Although there was no general effect for "probable 
exposure" to the individual syllable as was apparent from 
the low t-scores, there seems to be a possible effect on 
the response scores for a final consonant group if the 
probable exposure to stimulus syllables of that group is 
sufficiently low. 

The experience the Japanese Ss have had with a particular stimulus syllable prior to the experiment is in part a function of their experience with English: whether the syllable occurs in English and how often it occurs. Therefore, it would be expected that the performance of the Ss on groups of stimulus syllables would be a function of the number of stimulus syllables experienced and the frequency of experience with each.

FIGURE 5

Sum of Rinsland Total-Count Defined Frequencies for the Response Score Groups as a Function of Total Number Correct

FIGURE 6

Number of Stimulus Syllables Occurring in Stressed Positions of "Real" Words as a Function of Total Number Correct

FIGURE 7

Number of Stimulus Syllables that are English Monosyllabic Words as a Function of the Total Number Correct

One plausible conjecture is that lack of exposure to 
a particular type or group of stimulus syllables prevents 
the S from making some of the generalizations needed for 
successful initial consonant identification. If the S 
has been exposed to several of the stimulus syllables of 
that group fairly frequently, he arrives at some generali- 
zations aiding perception. Further exposure might make for 
little improvement in performance once the threshold has 
been crossed. 

Naturally as the overall appropriate experience in 
English is obtained, more and more of the types are mastered 
until, as was apparently the case with the native speakers 
of English, there is no longer any difference in the total 
correct for various contexts. 

As Japanese students apparently scored significantly lower in this task only for types that rarely occur in English, there is little to encourage designers of instructional programs to drill them. The rarer types are encountered less often, are for the most part less important, and are therefore mastered only later in the experience of the second-language learner.

The analysis of variance also showed the A x B interaction, the interaction between the syllable-initial consonant, I, and the syllable-medial vowel, V, to be highly significant (F = 2.57; 21, 65 d.f.; P < .01).

Figure 8 shows in graph form the score correct out of a possible 132 for each of the 28 interaction pairs. The vowels of the pairs are placed at even intervals along the horizontal axis in the order of decreasing difference between corresponding /l/ and /r/ interaction pairs. The difference was taken to be the score for the /r/ interaction pair minus the score for the /l/ interaction pair:

    Difference Score_i = Score_{/r+V_i/} - Score_{/l+V_i/}

The difference scores for vowel nuclei /oI, e, i/ are therefore negative; the rest are positive. (For convenience of notation, the vowels have been numbered in order of decreasing difference score; cf. Figure 8.)
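
The numbering of the vowels by decreasing difference score can be sketched as below. The scores and the vowel labels "A", "B", "C" are HYPOTHETICAL placeholders, not the values plotted in Figure 8, and the function name `rank_by_difference` is an assumption made for the example.

```python
def rank_by_difference(scores_r, scores_l):
    """Number the vowels V1, V2, ... in order of decreasing
    (score for /r/ + V) minus (score for /l/ + V)."""
    diffs = {v: scores_r[v] - scores_l[v] for v in scores_r}
    ordered = sorted(diffs, key=diffs.get, reverse=True)
    return [(f"V{i}", v, diffs[v]) for i, v in enumerate(ordered, start=1)]


# HYPOTHETICAL totals correct out of 132 for three placeholder vowels.
scores_r = {"A": 120, "B": 100, "C": 95}
scores_l = {"A": 70, "B": 98, "C": 110}

ranking = rank_by_difference(scores_r, scores_l)
for label, vowel, diff in ranking:
    print(f"{label}: vowel {vowel}, difference score {diff:+d}")
```

A negative difference score, as for placeholder vowel "C" here, corresponds to a vowel before which /l/ was identified better than /r/.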

Scheffé Test
(Edwards, 1960)

The difference in total correct between /lV1-6/, (/lə, lI, la, læ, laI, lε/), and /rV1-6/, (/rə, rI, ra, ræ, raI, rε/), was significant at the .01 level as determined by Scheffé's test for multiple comparisons. First, the difference between /lV1/ and /rV1/ (/lə/ and /rə/) was tested and found to be non-significant. With each subsequent test, the next pair was added to the totals for /V1-n/. The highest score, A = 11.85, was obtained for the difference in total score correct between /lV1-6/ and /rV1-6/. Upon the inclusion of /lV7/ and /rV7/ the magnitude of A decreased, because the additional difference failed to increase the value of the numerator, d², sufficiently to offset the increase of two in the a² term in the denominator.

FIGURE 8

Score Correct in Total and Percent as a Function of CV Interaction Pair

[Vertical axis: Total Number Correct out of 132 Possible. Horizontal axis: Syllable Nucleus.]

Articulatory Description

Vowels /V1-6/, (/ə, I, a, æ, aI, ε/), corresponding to large statistically significant difference scores, are all unrounded central and central-front vowels. The remaining vowels, corresponding to smaller non-significant difference scores, are divided into two groups: /V7-12/, (/aU, U, 0, u, 0, oI/), which either are or include a rounded back vowel, and /V13-14/, (/e, i/), which are unrounded high-front long diphthongized vowels (cf. Figure 9). The difference scores for diphthongs /aI, aU, oI/ correspond more closely to that expected for the first phoneme rather than the second. Thus, /oI/ corresponds to a very small difference score, while /aU/ corresponds to one somewhat higher and just below the difference score for /ε/. The difference score for /aI/ falls well within the range of the significant group.

One S reported hearing an "o-like" sound in connection with /r/. If the S really did hear a type of rounding in connection with /r/ and not with /l/, higher identification scores would be expected for /r/ before unrounded vowels than for /r/ before rounded vowels. The lack of "rounding" for /l/ should therefore lead to lower identification scores for /l/ before unrounded vowels and higher scores for /l/ before rounded vowels. This much is borne out in general by the data. Yet the absence of the two long high front unrounded vowels from the unrounded group, and in fact their occurrence at positions expected for vowels of maximum rounding, remains totally unexplained. (Attempts to relate the Japanese [r], palatalized alveolar flap before /i, y/, seem highly contrived.)

The location of /V1-6/ and the remaining vowels in articulatory (perceptual) space (Björkhagen, 1956) is shown in Figure 9. Figure 10 likewise shows the location of the various vowels in articulatory (perceptual) space, but includes information as to the magnitude of the /l/ versus /r/ syllable difference scores for each of the various vowels. Thus, the innermost enclosed area contains the vowel /ə/, associated with the greatest response score difference between /l/ and /r/ syllables (cf. Figure 8), the next area the vowel with the next highest difference score, and so on.
FIGURE 9

Grouping of English Vowels According to lV-rV Response Score Difference

[NS = Non-significant]

FIGURE 10

Syllable Nuclei Grouped According to [r°] Minus [l°] Response Score Magnitude
Discussion of Some Acoustic Properties
of Stimulus Syllables in Relation to Response Scores

Description

F1 frequencies are generally higher for /V1-6/ than for the remaining vowels. F2 frequencies for /V1-6/ range from 11 to 18 hundred cps (cf. Figure 11); F2 frequencies for /V[?]-11/ lie below 11 hundred, those for /V13-14/ above 18 hundred. F3 frequencies for /V1-6/ range from 22 to 26 hundred cps. F3 frequencies for /V[?]-11/ tend to lie below those for /V1-6/, while F3 frequencies for /V13-14/ tend to lie above those for /V1-6/. It must be noted, however, that the range and differences of F3 frequencies for the various vowels are relatively small compared with the range and differences of F2 frequencies for the same vowels.

F1 frequencies of initial /l/ and /r/ are essentially the same regardless of which vowel nucleus follows (cf. Figure 12). The range and variability of F2 frequencies are greater for initial /l/ than for initial /r/, but there is no difference between F2 frequencies of /l/ and /r/ followed by /V1-6/ and those followed by the remaining vowels. The major difference between the formant frequencies of initial /l/ and /r/ lies in F3. As was the case for the F2 frequency of /l/, the F3 frequencies of /l/ have a greater range and variability than the F3 for /r/ in the same contexts. This variability is very small, however, compared to the F3 frequency differences between initial /l/ and /r/.



Any hypothesis of acoustic parameters that corresponds to the response scores (RS) would have to account for high RS for initial /r/ followed by /V1-6/, low RS for initial /l/ followed by /V1-6/, and intermediate scores with less difference between the RS for /l/ + /V7-14/ and /r/ + /V7-14/ (cf. Figure 8).

While the F1 frequencies are generally higher for /V1-6/ (Figure 11) than for the remaining vowels, and this corresponds to the magnitude of the difference scores, the F1 frequencies for /V1-6/ preceded by /l/ are the same as the F1 frequencies for /V1-6/ preceded by /r/. Therefore, the F1 frequencies fail to correspond to the difference in the RS for stimulus syllables beginning with /l/ from those beginning with /r/.

The same argument may be used for the F2 frequencies of /V1-6/ as compared with those of the remaining vowel nuclei. While a certain range of F2 frequencies from 11 to 18 hundred cps contains the F2 frequencies for /V1-6/, and the F2 frequencies for the remaining vowels lie either above 18 hundred or below 11 hundred cps, there is nothing in this to account for the CV interaction observed.

F1 and F2 frequencies for initial /l/ and /r/ do not differ on the average, although the F2 frequencies for the various [l] varied more than those for [r]. The greater variability of the F2 frequencies for [l] allophones would account for slightly lower RS for syllables beginning with /l/, but would not account for the other variations. While the F3 frequency for /l/ is considerably higher than the same for /r/, this does not suggest which of the two would have the higher RS; neither does it account for other variability. Any one of the acoustic parameters for C or V alone would only account for a main effect of C or V respectively.

In order to account for the CV interaction, where there are significant differences between RS for /l/ and /r/ syllables containing /V1-6/ and nonsignificant differences between /l/ and /r/ syllables containing the other vowels, the acoustic parameters of the various levels of C must be compared with those of the various levels of V. Appropriate dimensions for such a comparison might be the direction and magnitude of the transitions of the various F frequencies from a particular level of C to a particular level of V. The direction and magnitude of the transitions are shown in Figure 13.
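
The comparison of a level of C with a level of V can be sketched as a signed difference, where the sign gives the direction of the transition and the absolute value its magnitude. The formant values below are HYPOTHETICAL figures in cps chosen only for illustration; they are not Lehiste's (1964) measurements, and the names `transition` and `describe` are assumptions made for the example.

```python
def transition(consonant_hz, vowel_hz):
    """Signed formant transition from C to V; positive means the formant rises."""
    return vowel_hz - consonant_hz


def describe(delta_hz):
    """Split a signed transition into (direction, magnitude in cps)."""
    if delta_hz > 0:
        direction = "rising"
    elif delta_hz < 0:
        direction = "falling"
    else:
        direction = "level"
    return direction, abs(delta_hz)


# HYPOTHETICAL F3 values in cps, chosen only for illustration.
# They are not Lehiste's (1964) measurements.
F3_CONSONANT = {"l": 2700, "r": 1600}
F3_VOWEL = {"mid vowel": 2400, "high front vowel": 3000}

table = {
    (c, v): describe(transition(F3_CONSONANT[c], F3_VOWEL[v]))
    for v in F3_VOWEL
    for c in F3_CONSONANT
}
for (c, v), (direction, magnitude) in table.items():
    print(f"/{c}/ + {v}: {direction}, {magnitude} cps")
```

With these invented values the two consonants move in opposite directions into the mid vowel but in the same direction into the high front vowel, so in the second context only the magnitude, not the direction, separates them.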

Since the F1 and F2 frequencies for the various levels of C and V did not differ systematically in terms of magnitude and direction with regard to the /l/ and /r/ syllables, the F1 and F2 frequency transitions do not account for the differences in RS of /l/ versus /r/ syllables. (The broken line representing the /l/ syllables corresponds closely to the solid line representing /r/ syllables for DF1 and DF2.)
FIGURE 13

Direction and Magnitude of Transition for Initial /l/ and /r/ to Various Following Vowels as a Function of CV Interaction Pair, Lehiste (1964) Data

[Horizontal axis: Syllable Nucleus. Broken line: sequences starting with /l/. Solid line: sequences starting with /r/.]
The DF3, the F3 transitions from C to V, do, however. For all V except /e, i/ the F3 transitions are opposite in direction and differ in magnitude from CV pair to CV pair (cf. Figure 14). For /r/ + /V1-6/ the F3 transitions are positive and large; for /l/ + /V1-6/ they are negative and small. For /r/ + /V7-12/ the F3 transitions are positive, but less in magnitude than for /r/ + /V1-6/. For /l/ + /V7-12/ the F3 transitions are negative and greater in magnitude than those for /l/ + /V1-6/. The F3 transitions for /l/ and /r/ + /V13-14/ differ in magnitude, but not in direction. While the difference in magnitude is rather great, it is expected that the absence of the directional cue would reduce the RS. As seen in Figure 8, the RS differences lie in the non-significant group.

Figure 15 shows the magnitudes of the various F3 transitions where the directions are opposite, i.e., including /V13-14/. The order of the vowel nuclei is the same as in Figure 8. For the sake of comparison, Figure 16, a graph of the various RS, is presented immediately below. The mean DF3 (F3 transitions) for /l/ and /r/ syllables of the NS group are more different than the corresponding mean RS for NS /l/ and /r/ syllables. As seen in Figure 17, the exceptionally high score for the CV pair /rə/ would not have been expected on the basis of the magnitude of the F3 transition. The students may have been given /r/ recognition practice, and therefore tended to mark /r/ for syllables containing /ə/ if they were in doubt. Actually, a single S with an R bias when the syllable nucleus was /ə/ could have caused the deviation. Yet the deviation of the /rə/ RS from that expected on the basis of the F3 transition lies within expected limits.

Since there were indications that the Ss' exposure to various levels of C corresponded to RS, estimations of exposure were also made for the 28 CV interaction pairs. These estimations were then graphed in Figures 18, 19, and 20. Figure 18 shows the Rinsland total-count defined frequencies for the various interaction pairs; Figure 19 shows the number of different stimulus syllables found in the stressed position of English words; and Figure 20 shows the number of stimulus syllables which constitute English monosyllabic words. The data are presented in order of decreasing RS difference between the /l/ and the /r/ syllable for each syllable nucleus.

There appears to be no (significant) correlation between the various estimations of exposure and the RS for the various CV pairs. While the /l/ and /r/ syllables containing /V1-6/ have a higher mean frequency (cf. Figures 18 and 19), and this corresponds to the magnitude of the response difference score, it is directly counter to the RS themselves. Generally, the frequencies of occurrence for /l/ + /V1-6/ are as high as the frequencies for /r/ + /V1-6/, yet the RS for /l/ + /V1-6/ are significantly lower than those for /r/ + /V1-6/ at the .01 level according to the Scheffé procedure.

There is no observable effect on the RS for /l/ and /r/ syllables containing vowels /e, o/ which could be related to the free variation of [l] and [r] before those vowels in some speakers of Japanese (cf. Figure 21). The distribution of the Japanese allophone before the Japanese /i, e/ parallels the apparent difference reversal between the RS of /l/ and /r/ syllables containing those vowels. It seems less likely that the RS were caused by that allophonic distribution than that both the Japanese [r] before the Japanese /i, e/ and the RS for /l/ and /r/ syllables containing /i, e/ correspond in some way to the high F2 and F3 positions of these Japanese and English vowels.



Summary of the Results 

An analysis of variance showed the final consonant context in CVC syllables to have a significant main effect on the identification of initial /l/ and /r/. The final consonants /θ, ð, ʒ/ corresponded to significantly lower /l/ and /r/ identification scores than final consonants /l, 0/ at the .01 level according to the Duncan procedure. (0 indicates the absence of a final consonant.)

As expected, the presence of a final consonant probably hinders the identification of initial /l/ and /r/. Thus, a relatively high identification score was observed for syllables with no final consonant.

Since the Ss were given the final consonant context on the score sheet, the task was changed from one of identification to one of comparison for final /l/ and /r/. The task then was to determine whether the initial consonant was the same as or different from the final consonant. Apparently the initial and final varieties of /l/ were more comparable for the Ss than those of /r/, although the mean number correct for CV/r/ was not significantly different from that of CV/l/ at the .10 level according to the Duncan procedure.

There was no evidence of any significant difference in responses to "real" and "non-real" syllables in general. It was thought that identification of /l/ and /r/ might have been extra low for syllables ending in /θ, ð, ʒ/ because of their extreme rarity. A threshold effect requiring a certain amount of contact with that syllable type in order to identify the initial /l/ and /r/ normally might yield such results.

At best the effect for /C/ was weak. As the lowest scores occurred for the rarest syllable types, no major emphasis on training these syllables would seem to be warranted.

There was no main effect for the post-/l/ and /r/ V (vowel) context. In other words, there was no group of one or more vowels for which the identification scores of both /l/ and /r/ were significantly higher or lower than any others. However, there was a very strong interaction effect between the initial C and the V.

/r/ before /ə, I, a, æ, aI, ε/ was identified significantly better than /l/ before /ə, I, a, æ, aI, ε/. That is, /r/ seemed to be identified significantly better before unrounded vowels than did /l/. Yet conspicuously absent from that generalization were the unrounded high front /i, e/.

Examination of the third formant transitions from /C-/ to /-V-/ as reported in Lehiste (1964) revealed that the third formant transitions from /l/ and /r/ to /i, e/ differed in magnitude, but not in direction. Third formant transitions from /l/ and /r/ to /ə, I, a, æ, ε/ differed in both. The identification of initial /l/ and /r/ bears close relation to the magnitude of the third formant transition from /C-/ to /-V-/, provided the transitions from /l/ and /r/ for that context are opposite in direction. There was no relation between the identification scores and exposure measures for the CV interaction pairs.

Apparently the Ss are acting in much the way an acoustic detector would if it were to utilize only third formant magnitude and direction cues for identification. The great difference in third formant starting positions for /l/ and /r/, as well as abruptness and duration characteristics, seemed not to have been perceived, or at least seemed not to have been effectively utilized.

Apparently there is some low level means whereby the 
listener can infer the less perceivable and less variable 
characteristics. Otherwise the listener would be left with 
almost unending complexity and the perception for native 
speakers would be far more context dependent than the pilot 
tests indicated. 

The Japanese student listeners in this experiment seem 
to have not yet found themselves in such control of: 

1) basic cue detection, such as the detection of less 
context-variable "dynamic" cues; or 

2) low level automatic processing of context-variable 
cues into stable cues, so that perception would be 
independent of context. 

Deficiency in (1) would point toward the need for more 
training in detecting and utilizing those cues. 

Deficiency in (2) would point to the need for improved 
identification of preceding and postceding phonemes as well 
as practice in relating various context-variable cues to 
one phoneme. 

Which of the two would be the goal of instruction to 
improve the identification of /!/ and /r/ cannot be 
determined here. Either, if possible, would be sufficient. 






REFERENCES 



Anderson, T. R. A case for contrastive phonology. International Review of Applied Linguistics, 1964, 2, 219-230.

Björkhagen, I. Modern Swedish Grammar. Stockholm, Sweden: Svenska Bokförlaget, 1956.

Bloch, B. Studies in Colloquial Japanese IV. Language, 1950, 86-125. (Also in Joos, M. Readings in Linguistics I, 4th ed.)

Calearo, E., & Lazzaroni, A. Speech intelligibility in relation to the speed of the message. Laryngoscope, 1957, 410-419.

Edwards, A. L. Experimental design in psychological research. New York: Holt, Rinehart, and Winston, 1960. (Revised Edition)

Francis, W. N. The structure of American English. New York: Ronald Press Co., 1958.

French, N. U., & Steinberg, J. C. Factors governing the intelligibility of speech sounds. Journal of the Acoustic Society of America, 1947, 19, 90-119.

Hanson, G. Phoneme perception. Uppsala Universitets Årsskrift, 19[??], 11, 109-14[?].

Hardy, W. G. Problems of audition, perception and understanding. Volta Review, 1956, 289-300, 309.

Joos, M. Acoustic phonetics. Language Monograph No. 23, 1948.

Kimizuka, S. Problems in teaching English based upon a contrastive analysis of Japanese and English. Unpublished master's thesis, University of California, Los Angeles, 1962.

Nakajima, F. Comparison of Japanese and English. Paper presented at the Specialists' Conference, September 1956. Tokyo, Japan: Kenkyusha Ltd., 1957.

Lehiste, I. Acoustical characteristics of selected English consonants. International Journal of American Linguistics, 1964, part IV.






Liberman, A. M., Delattre, P. C., Cooper, F. S., & Gerstman, L. J. The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs, 1954, 1-13.

Lindquist, E. J. Design and analysis of experiments in psychology and education. Boston: Houghton Mifflin Co., 1953.

Lisker, L. Minimal cues for separating /w, r, l, y/ in intervocalic position. Word, 1957, 13.

Luescher, E., & Zwislocki, J. Adaptation of the ear to sound stimuli. Journal of the Acoustic Society of America, 1949, 135-139.

Moser, H. M., & Dreher, J. J. Effects of training on listeners in intelligibility studies. Journal of the Acoustic Society of America, 1955, 1213-1219.

O'Connor, J. D., Gerstman, L. J., Liberman, A. M., Delattre, P. C., & Cooper, F. S. Acoustic cues for the perception of initial /w, j, r, l/ in English. Word, 1957, 13, 24-43.

Prator, C. H., Jr. Manual of American English pronunciation. New York: Holt, Rinehart, and Winston, 1957. (Revised Edition)

Toyoda, M. Toward a comparative phonetics of Japanese and English. Paper presented at the Specialists' Conference, September 1956. Tokyo, Japan: Kenkyusha Ltd., 1957.

Rinsland, H. D. A basic vocabulary of elementary school children. New York: MacMillan Co., 1945.

Winer, B. J. Statistical principles in experimental design. New York: McGraw-Hill, 1962.






Appendix A 



You will hear one word for each number. The word will 
be one of the two words beside the number. 

Make a circle around the word you hear . If you are 
not sure make a circle around the word which is nearer to 
the one you hear. BE SURE you make a circle around one 

word and only one word. 

After five words you will hear a number. Be sure you 
are ready for that number after you hear it. 

Let's do some examples.

1. rane     lane

2. roge     loge

3. lan      ran

4. raws     laws

5. lace     race

6. lear     rear


Are there any questions? 
M -- stimulus syllable is a real English monosyllabic word without clustering.

S -- stimulus syllable occurs only as a stressed syllable in a real English word. Clustering included here.

M* -- Sometimes pronounced as one syllable.

** Number of stimulus syllables.